@ngxson
Collaborator

@ngxson ngxson commented Nov 18, 2025

Note: this PR is fully hand-coded. No AI was involved, as most of the copy-paste is better done by hand.

This PR splits server.cpp into smaller components:

  • Main: server.cpp contains server_context, server_slot and the handler functions
  • server-common contains most of the functions from utils.hpp (some one-off functions are moved to server-task)
  • server-task contains all server_task_* classes and subclasses. The main idea is to treat these classes as serializers/deserializers. In the future, most of the JSON handling will be done here (instead of being scattered across the code base)
  • server-queue contains the implementation of the task queue and the result queue. The goal is to group all of the mutex-related logic into one file, so it can potentially be reused for other things in the future (completely decoupled from the other parts of the server)
flowchart TD
    server_common -.- main
    server_task -.- server_slot
    server_task -.- server_queue
    server_slot <-->|update slots| server_context
    server_queue -->|get task| server_context
    HTTP_handlers -->|post task| server_queue
    subgraph main
        server_slot
        server_context -.- HTTP_handlers
    end
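To illustrate the "post task" / "get task" edges of the flowchart and the idea of keeping all mutex-related logic in one place, here is a minimal sketch. It is not the code from this PR: `demo_task`, `demo_task_queue`, `post`, `wait_and_pop`, `terminate` and `run_demo` are hypothetical names chosen for illustration, and the real server-queue may be structured differently.

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a server task (not the real server_task).
struct demo_task {
    int         id;
    std::string prompt;
};

// Hypothetical queue: callers never touch the mutex directly,
// so all locking lives inside this one class.
class demo_task_queue {
public:
    // Producer side (e.g. an HTTP handler): post a task and wake the consumer.
    void post(demo_task task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push_back(std::move(task));
        }
        cv_.notify_one();
    }

    // Consumer side (e.g. the main loop): block until a task is available.
    // Returns false once terminate() was called and the queue is drained.
    bool wait_and_pop(demo_task & out) {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return !tasks_.empty() || terminated_; });
        if (tasks_.empty()) {
            return false;
        }
        out = std::move(tasks_.front());
        tasks_.pop_front();
        return true;
    }

    // Signal that no more tasks will be posted.
    void terminate() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            terminated_ = true;
        }
        cv_.notify_all();
    }

private:
    std::mutex              mutex_;
    std::condition_variable cv_;
    std::deque<demo_task>   tasks_;
    bool                    terminated_ = false;
};

// Posts n tasks from a producer thread and consumes them on the calling thread.
// Returns the task ids in the order they were consumed.
std::vector<int> run_demo(int n) {
    demo_task_queue queue;
    std::thread producer([&queue, n] {
        for (int i = 0; i < n; ++i) {
            queue.post({i, "prompt " + std::to_string(i)});
        }
        queue.terminate();
    });

    std::vector<int> seen;
    demo_task task{};
    while (queue.wait_and_pop(task)) {
        seen.push_back(task.id);
    }
    producer.join();
    return seen;
}
```

Calling `run_demo(3)` posts three tasks from a separate thread and drains them on the caller's thread. Because the queue owns its mutex and condition variable, neither side needs to know how the synchronization works, which is the kind of decoupling the server-queue bullet describes.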

@ngxson ngxson changed the title from "server: split server code into main/common/task/queue" to "server: split server.cpp code into server/common/task/queue" Nov 18, 2025
@ngxson
Collaborator Author

ngxson commented Nov 18, 2025

OK, it seems the Windows CI jobs are failing due to a Cloudflare outage: https://www.cloudflarestatus.com/incidents/8gmgl950y3h7

@pwilkin
Collaborator

pwilkin commented Nov 18, 2025

Yeah, the CIs are completely unreliable today due to the Cloudflare thingy.

STOP_TYPE_LIMIT,
};

struct task_params {
Collaborator Author

Note: slot_params is renamed to task_params, as I think it's a more appropriate name.

@ngxson ngxson marked this pull request as ready for review November 18, 2025 16:40
@ngxson ngxson requested a review from ggerganov as a code owner November 18, 2025 16:40
Comment on lines +8 to +11

#define JSON_ASSERT GGML_ASSERT
#include <nlohmann/json.hpp>

Member

I think it won't be very difficult to include just <nlohmann/json_fwd.hpp> here. For example, server-queue.cpp currently includes nlohmann/json.hpp indirectly even though it does not need it. We can fix this later.

@ngxson ngxson merged commit b8372ee into ggml-org:master Nov 24, 2025
64 of 65 checks passed
Nexesenex added a commit to Nexesenex/croco.cpp that referenced this pull request Nov 25, 2025